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Multi- vehicle  Cooperative  Search  with  Uncertain  Prior  Information 


Chunlei  Zhang  *  Raul  Ordonez  *  Corey  Schumacher  ^ 


Abstract 

We  present  a  possible  solution  for  the  multi-vehicle  cooperative  search  problem  via  the  surrogate  op¬ 
timization  method.  We  establish  a  discrete  mathematical  model  for  the  multi- vehicle  cooperative  search 
stationary  target  with  uncertain  prior  information.  We  reformulate  the  Past,  Present  and  Predicted 
Future  (PPP)  algorithm  we  introduced  in  previous  work  based  on  the  new  model  and  propose  an  imple¬ 
mentation  to  adapt  to  the  new  scenario.  Monte  Carlo  simulations  indicate  that  the  new  implementation 
achieves  better  performance.  Moreover,  the  scalability  of  the  PPP  algorithm  is  addressed. 

tr 


1  Introduction 


The  idea  of  multiple  uninhabited  autonomous  vehicles  (UAVs)  able  to  adaptively  react  to  their  environment 
and  learn  about  their  surroundings  while  following  either  an  individual  or  a  communal  agenda  is  an  intriguing 
issue.  Achieving  such  a  degree  of  control  and  producing  such  sophisticated  behavior  remains  an  elusive  goal 
that  presents  considerable  challenges  due  to  the  inherent  complexity  of  the  task,  and  also  because  it  may 
be  approached  from  a  variety  of  different  angles.  The  problem  of  multi-vehicle  coordination  and  control  has 
been  receiving  an  extraordinary  amount  of  attention  during  the  past  few  years  due  to  its  critical  importance 
for  a  myriad  of  applications. 

Existing  work  on  multi-vehicle  control  focuses  on  receding-horizon  planning  (that  is,  optimization  methods) 
and  hierarchical  structures.  The  research  reported  in  this  paper  benefits  from  previous  work  that  follows  the 
first  approach.  A  receding  horizon  trajectory  planner  based  on  Mixed  Integer-Linear  Programming  (MILP) 
that  is  capable  of  planning  planar  trajectories  to  a  goal  constrained  by  no-fly  areas,  or  obstacles,  and  aircraft; 
dynamics  were  proposed  in  [1,  2,  3].  A  generalized  multi-vehicle  formation  stabilization  problem,  free  from 
a  leader-follower  architecture  is  defined  in  [4]  and  model  predictive  control  (MPC)  is  applied. 

Game  theory  based  cooperative  decision  making  for  multi-vehicle  include  (5,  6]  and  [7].  Moreover,  graph 
theory  is  also  employed  extensively  in  multi-vehicle  coordination.  A  disjoint  path  algorithm  for  reconfigu¬ 
ration  of  multi-vehicle  was  proposed  in  [8],  A  class  of  triangulated  graphs  for  algebraic  representation  of 
formations  are  introduced  to  specify  a  mission  cost  for  a  group  of  vehicles  (9),  then  the  obtained  optimal 
control  problem  is  solved  using  NTG  (an  optimal  control  program  developed  at  Caltech).  A  double-graph 
model  is  used  in  [10]  to  treat  the  string  stability  as  a  kind  of  performance  of  multi-vehicle  system  with  acyclic 
formation  structures.  Moreover,  a  theoretical  explanation  of  using  nearest  neighbor  rules  in  coordinating 
groups  of  mobile  autonomous  agents  can  be  found  in  [11]. 

Related  to  the  work  on  coordination  of  multi- vehicle  are  swarms  and  formation  control.  In  [12,  13,  14,  15], 
the  authors  considered  a  swarm  which  moves  in  an  attractant/repellent  profile  (i.e.,  a  profile  of  nutrients 

•This  work  was  supported  with  AFRL/AROSR  grant  No.  F336 15-0 1  -C-3 154.  In  addition  the  work  was  supported  by  DAGSI. 
C.  Zhang  and  R.  Orddhez  are  with  the  Department  of  Electrical  and  Computer  Engineering,  University  of  Dayton,  Dayton, 
OH,  45469-0226  USA.  (Phone:  +937  229-3183.  E)-mail:  ordonez@ieee.org.) 

IC.  Schumacher  is  with  the  Control  Science  Division,  Air  Force  Research  Laboratory  (AFRL/VACA),  Wright-Pat terson 
AFB,  OH,  45433-7531  USA. 
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or  toxic  substances)  and  showed  collective  convergence  to  (divergence  from)  more  favorable  (unfavorable) 
regions  of  the  profile.  The  inter-individual  interactions  and  the  interactions  with  the  environment  in  the 
model  in  these  articles  used  artificial  potential  functions^  a  concept  that  has  been  used  extensively  for  robot 
navigation  and  control  (16,  17].  The  results  in  [18,  19,  20,  21,  22]  are  based  on  using  virtual  leaders  and 
artificial  potentials  for  vehicle  interactions  in  a  group  to  maintain  the  group  geometry.  They  use  the  system 
kinetic  energy  and  the  artificial  potential  energy  as  a  Lyapunov  function  to  prove  closed  loop  stability  and 
employ  a  dissipative  term  to  achieve  asymptotic  stability  of  the  formation.  Other  results  using  artificial 
potential  functions  include  [23],  where  the  authors  consider  a  distributed  control  approach  for  groups  of 
robots,  called  the  social  potential  fields  method,  which  is  based  on  artificial  force  laws  between  individual 
robots  and  robot  groups.  The  force  laws  are  inverse-power  or  spring  force  laws  incorporating  both  attraction 
and  repulsion.  In  [24],  the  dynamics  of  the  vehicle  group  is  decomposed  into  average  system  and  shape 
system.  The  internal  formation  control  is  achieved  by  controlling  the  shape  system,  while  the  maneuvering 
control  is  achieved  by  the  controlling  of  the  average  system.  Furthermore,  the  role  of  information  flow  in 
formation  stability  was  studied  in  [25],  similar  work  can  be  found  in  [26].  A  more  recent  overview  of  formation 
control  can  be  found  in  [27]. 

This  paper  is  based  on  previous  research  [28]  and  organized  as  follows.  The  next  section  begins  with  the 
formulation  of  our  research  problem,  a  discrete  mathematical  model  of  multi-vehicle  cooperative  search  for 
stationary  targets  with  uncertain  prior  information  is  introduced,  and  surrogate  optimization  [29,  30]  is 
applied  to  the  cooperative  search  problem.  We  reformulate  the  PPP  algorithm  introduced  in  [28]  in  Section 

3.  A  specific  implementation  of  the  PPP  algorithm  to  fully  use  the  prior  information  is  proposed  in  Section 

4.  There,  Monte  Carlo  simulations  are  used  to  comi>are  these  proposed  schemes  with  exhaustive  search  and 
the  scalability  of  PPP  algorithm  is  discussed.  Finally,  Section  5  concludes  the  paper  and  discusses  some 
future  works. 


2  Problem  Formulation 

2.1  Model 


For  convenience,  we  concentrate  on  discrete  time  models  within  a  two-dimensional  plane,  since  the  setup 
fits  nicely  within  the  simulation  environment.  We  assume  whatever  constraints  exist  for  vehicles  to  be  not 
dynamic,  but  rather  kinematic  in  favor  of  focusing  on  high-level  mechanisms  originating  from  the  group  of 
vehicles.  We  will  assume  there  are  m  vehicles  and  that  the  i^^  one  obeys  a  discrete  time  kinematic  model 
given  by 

+  1)  =  +  dcos{ei{k)) 

+  1)  =  +  dsin{ei{k)) 

6i{k+i)  =  ei{k)  +  f0AAk))  (1) 


where  k  is  the  discrete  time  index  taking  values  in  the  nonnegative  integers  {0, 1,2,.. .}  (k  ^  denotes  the 
number  of  search  steps);  xj,,  and  xt,  are,  respectively,  the  two  Cartesian  coordinate  of  i‘  whicle;  d  is  a 
constant  step  size;  0^  is  the  orientation  of  the  i*'*  vehicle;  can  be  a  nonlinear  function  encoding  kinematic 
restrictions  on  the  vehicles;  and  is  the  local 'controller  corresponding  to  the  t*'‘  vehicle.  For  convenience, 

let  xi,  =  [xi,.,xt,r,  and  xt  =  I(xt,)T,flir  . 

In  some  applications  of  cooperative  search,  we  may  have  access  to  prior  information  (e.g.,  where  the  targets 
are)  but  with  uncertainty.  The  environment  is  modeled  as  a  two-dimensional  plane,  the  upper  right  quadrant 
of  a  Cartesian  coordinate  system  with  axes  (xi,  Xa).  We  may  set  up  a  Gai^ian  profile  map  which  is  known 
to  all  vehicles.  The  Gaussian  profile  encodes  the  possible  target  locations  x^  =  »*  =  ...  ,n  offered 

by  the  prior  information  as  centers  of  the  Gaussian  peaks,  where  we  assume  to  know  the  number  n  of  targets. 


n 

Mp(xi,  X2,  k  =  0)  =  5])  Ci  exp 


(xi-x«.)^-f(xa-xij^1 

W  J 


(2) 
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A  possible  picture  is  in  Figure  1  (a),  where  x*  are  [6,46]'^,  [50, 6]^  and  [40,40]'^.  We  encode  the  uncertainty 
of  the  prior  information  with  the  peak  width  Vi  and  the  distance  of  the  real  (but  unknown)  target  position 
to  the  center  of  the  peak  in  terms  of  Vi.  For  example,  we  can  say  the  real  target  is  O.Suj,  Vj,  2vu  etc.,  distance 
units  from  the  center,  that  is,  we  consider  the  worst  case  scenario  for  each  uncertainty  level  (e.g.,  given  a  fixed 
peak  width  Vj  =  4  as  in  Figure  1  (a),  the  blue  star  represents  the  real  target,  which  is  at  one  Vi  uncertainty 
from  the  center).  Furthermore,  we  can  intentionally  encode  the  priority  level  of  each  target  as  the  height 
Ci  of  the  peak  (e.g.,  we  assign  Cj  =  3  to  the  peak  center  located  at  (40,40)  as  the  most  important  target). 
Thereby,  we  expect  the  vehicles  to  find  the  most  important  target  first,  and  the  whole  search  performance 
should  coincide  with  the  uncertainty  level  about  the  prior  information.  Overall,  in  this  kind  of  scenario  all 
the  vehicles  share  Mp{xiyX2),  meanwhile  the  vehicle  sensor  will  sample  a  real  target  map  A/^ (a;  1,0:2)  (Figure 

1  (b)), 


Mt 


[xuXir  e 

otherwise 


(3) 


and  only  obtain  two  kinds  of  information:  0,  which  means  no  target,  and  1,  means  the  vehicle  found  a  target. 


fctUal  Prtof  InfociMflon  Map 


RaalTarpatMap 


Figure  1:  (a)  Prior  information  map,  (b)  Real  target  map  (c)  Cell  based  map  assumptions. 


We  discretize  not  only  the  vehicles’  movement,  but  also  the  proposed  map  as  shown  in  Figure  1  (a).  We 
will  let  the  vehicles  move  from  cell  to  cell  (i.e.,  the  center  of  one  cell  to  another)  rather  than  move  along 
a  smooth  curve,  and  therefore  we  will  start  with  the  assumption  that  all  vehicles  move  at  constant  speed 
and  in  discrete  time  index.  Also  we  assume  that  maximum  turn  angles  are  ±135  degrees,  i.e.,  becomes 
a  saturation  function.  Figure  1  (c)  illustrates  these  points:  the  triangle  in  the  middle  cell  represents  the 
vehicle’s  current  position  with  its  orientation  0^  =  0;  with  the  assumptions  of  constant  speed  and  discrete 
time  index,  the  vehicle  will  only  move  one  cell  each  time  index;  and  with  the  assumption  of  maximum 
angles  of  turn,  the  vehicle  will  only  have  seven  possible  cells  to  go,  which  are  denoted  by  the  circles,  and  the 
cross  denotes  the  cell  which  vehicle  cannot  visit  since  By  =  0.  Moreover,  we  ignore  the  difference  between 
diagonal  step  size  with  vertical  or  horizontal  step  size,  all  denoted  as  d  for  convenience.  The  inter- vehicle 
communications  are  instantaneous,  noiseless  and  have  unbounded  communication  distance,  i.e.,  we  assume 
perfect  communication.  These  assumptions  together  with  further  ones  made  for  the  Monte  Carlo  simulation 
will  be  used  throughout  this  paper.  Note  that  this  is  not  a  centralized  coordination  scheme,  since  if  non- 
ideal  communication  were  considered,  each  vehicle  would  have  its  own  verison  of  and  would  act  upon  it, 
sharing  as  much  information  as  allowed  by  the  communication  channel.  The  level  of  coordination  increases 
as  the  quality  of  the  communication  channel  improves. 


A  vehicle  located  somewhere  in  the  terrain  is  clearly  unable  to  observe  it  all  at  once  in  most  practical 
situations.  In  general,  then,  the  vehicle  is  only  able  to  sample  a  point  of  the  discrete  map  (3)  at  its  own 
location,  to  the  extent  that  its  own  sensors  or  some  external  information  source  allow  it.  Therefore,  we  may 
define  the  system  output  vector  as 


y  =  [y\.. .  . .  .,M^{xZ,xZ)V 


(4) 


where  yi  is  the  output  corresponding  to  the  vehicle  sampling  the  proposed  map  function  (3).  Each  vehicle 
can  update  the  prior  information  map  (2)  with  its  collected  sensor  information  at  time  index  A:, 

Mp(xi,X2ik-\-l)  =  Mp{xi,X2ik)  •  Ma{xi,X2,k-h  1)  (5) 
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where  •  means  element-wise  multiplication  for  matrices  and  Ma(rci,a:2,  A:  -f  1)  is  called  sensor  matrix,  which 
encodes  the  current  time’s  sensor  information  (each  time,  Mg  is  reset  to  a  matrix  whose  elements  are  1  and 
then  the  places  corresponding  to  the  current  vehicle  locations  are  filled  with  the  sensor  data  “0”  or  “1”). 
As  available  a  priori,  Mp  could  be  used  by  the  decentralized  controller  i/{-)  to  implement  output  feedback 
regulation  on  system  (1),  in  which  the  search  and  engagement  tasks  may  be  phrased  as  the  regulation 
objective  of  “flattening  out”  the  map  in  Figure  1  (b)  (in  other  words,  each  time  a  target  is  destroyed,  the 
corresponding  peak  would  vanish).  A  general  expression  of  control  law  under  perfect  communication  is 

=  fiiMp(,xi,X2,k),x„{k),y{k))  (6) 

where  all  vehicles  have  the  same  prior  information  map  Mp(xi,X2,  k),  and  the  vehicle  controller  is  able  to 
use  not  only  its  own  measurements  and  position,  but  also  those  from  all  other  vehicles. 


2.2  Cooperative  Search  using  Surrogate  Optimization 


Here,  we  attempt  to  apply  surrogate  optimization  method  in  multi- vehicle  cooperative  search  for  stationary 
targets  (28,  29,  30].  In  particular,  the  objective  function  will  represent  the  unknown  real  target  map  Mt  in 
equation  (3).  Vehicles  actuate  on  the  system  by  finding  and  destroying  targets,  thereby  “flattening”  Mf .  As 
each  vehicle  moves  about  and  collects  data,  it  is  able  to  refine  Mp  in  Equation  (2),  which  will  play  the  role 
of  the  surrogate  function  for  the  vehicles.  The  vehicle,  however,  is  unable  to  instantaneously  jump  between 
locations,  and  has  instead  to  move  along  a  trajectory  towards  the  specified  point.  Along  the  way  it  continues 
gathering  more  data,  thereby  potentially  improving  Mp.  Moreover,  using  the  idea  of  merit  function  (detailed 
in  next  section),  we  further  refine  the  general  control  law  (6)  to 

=  ftiTa.eni'{xi,X2,k),Xu{k),y(k))  (7) 

where  we  wish  to  achieve  coordinated  behavior  via  a  series  of  well  constructed  merit  functions. 


In  this  manner,  the  surrogate  optimization  method  is  modified  to  effectively  become  a  control  law:  when  a 
maximum’s  location  is  predicted,  the  controller  computes  the  direction  in  which  it  needs  to  move  fronr  its 
current  location  in  order  to  reach  the  desired  point.  Mathematically,  the  control  law  (7)  can  be  expressed  as 


«‘(fc)  =  [  arctan  )]  ,  “ 

where  ^  ^  maximum  (predicted  target)  of  the  merit  function  merit*(xi,X2,  A;),  i.e.. 


merit‘(a:‘  ,x’  ,fc)  =  max  merit‘(ii,X2.*) 


(8) 

(9) 


The  operator  ^  •  j  is  a  quantization  operation,  which  quantizes  the  angle  arctan  hito  one  of  the 

elements  from  the’set  {0,  ±45,  ±90,  ±135,  ±180}.  Finally,  will  be  a  saturation  function  taking  care  of 
the  maximum  turn  angles  restriction. 


/».(«*(*!))  =  I 


u\k), 

sgn(«<(fc))135«. 


-135®  <  u*{k)  <  135® 
otherwise 


3  PPP  Behavior  Based  Coordination 


In  order  to  realize  a  meaningful  level  of  coordination  and  given  that  we  have  m  vehicles,  where  rn  >  2,  we 
utilize  the  idea  of  surrogate  function  and  distance  function  to  form  certain  behaviors  for  each  vehicle.  And 
therefore  we  name  this  coordination  scheme  as  behavior  based  coordination. 


4 


If  the  vehicle  uses  surrogate  map  Mp  to  predict  a  target  position,  we  say  that  the  vehicle  is  in 
converge  behavior  (abbreviated  as  “C”).  Moreover,  each  vehicle  will  have  a  predefined  serpentine  search 
path  to  promise  high  detection  probability  with  high  search  effort,  which  we  name  exhaustive  search  behavior 
(abbreviated  as  “ES”).  Moreover,  we  extend  the  distance  function  introduced  in  [29]  into  two  dimensions, 
rewriting  it  as 


D(a;i,X2,fc)=  min  { 

l<t<mA: 


+ (2:2  “2:2^2} 


(10) 


which  evaluates  the  Euclidean  distance  from  every  map  cell  position  (x  1,2:2)  to  the  nearest  vehicles’  visited 
site,  where  [xi.,X2i]^  €  {2:vp(l), . . •  i2^t;p(A:)}  and  ...  is  the  combination  of  all  the 

vehicles  visited  sites,  we  call  this  kind  of  distance  function  ‘‘Global  Distance  Eunction”  (refer  to  Figure  2 
(a)).  The  distance  function  is  a  measurement  of  uncertainty  in  the  sense  of  Euclidean  distance.  D(xi,X2,  A:) 
represents  the  uncertainty  about  the  vehicle’s  knowledge  of  the  terrain  at  k  time  index  (in  the  sense  of 
Euclidean  distance),  which  is  an  experimental  design  criterion  that  is  intended  to  inhibit  clustering  of  vehicles 
and  thereby  to  ensure  that  the  vehicles’  locus  will  spread  all  over  the  search  terrain.  We  can  consider  that 
targets  should  be  looked  for  in  the  most  uncertain  places,  so  a  Tnaximum  of  distance  function  D  will  represent 
another  possible  target  location.  Therefore,  if  the  vehicle  uses  a  global  distance  function  to  predict  target 
position,  we  say  that  the  vehicle  is  in  Drive  Search  behavior  (abbreviated  as  “DS”). 


Gtobal  DtoUnc*  Function 


Expwtdod  Global  Diatanc*  Function 


ExpafKfad  Global  Dtstanco  Function 


Figiure  2:  Global  distance  function:  (a)  Global  distance  function,  (b)  Expanded  global  distance  function 
given  two  vehicles’  planned  trajectories, ,  (c)  Expanded  global  distance  function  given  three  vehicles’  planned 
trajectories,  (d)  Projection  of  (a),  (e)  Projection  of  (b),  (f)  Projection  of  (c). 

Here,  we  reformulate  the  PPP  algorithm  as  first  appeared  in  [28].  A  particular  construction  of  the  series  of 
merit  function  for  m  vehicles  can  be:  merit^(xi,X2,  A;)  =  Mp(xi,X2, A;),merit^(xi,X2, A:)  =  D{xi,X2yk)  and 
merit*(xi,X2,  A:)  —  D*{xi,X2,k)  for  3  <  i  <  m.  jD’(xi,X2,  A;)  is  called  expanded  global  distance  fimction 
where  not  only  includes  visited  sites  but  also  the  planned  trajectories  at  time  k  for  the 

previous  f  —  1  vehicles.  Examples  of  expanded  global  distance  functions  can  be  seen  in  Figure  2.  The 
projection  of  distance  functions  (Figures  2  (d),  (e)  and  (f))  to  the  xi  —  X2  plane  gives  a  more  distinctive 
view  of  the  dynamic  computing  of  the  uncertainty  of  the  search  area  (notice  that  it  makes  no  difference  how 
the  different  merit  functions  are  assigned  to  each  vehicle,  but  we  need  a  sequential  negotiation,  because  the 
next  control  force  depends  on  the  former  ones;  the  coordination  is  mainly  displayed  in  the  interaction  among 
vehicles  and  the  complexity  of  the  algorithm  also  lies  here).  Moreover,  we  can  see  clearly  the  elongation  of 
current  trajectories  due  to  the  future  path  plan  in  these  plots;  it  is  the  updating  of  the  uncertainty  map  that 
solves  the  clustering  originated  from  the  distance  function,  since  each  vehicle  is  equipped  with  a  distinct 
decision  maker. 


5 


Figure  5  gives  several  snapshots  of  one  simulation  where  we  have  four  vehicles  doing  cooperative  search  for 
three  stationary  targets  as  demonstrated  in  Figures  1  (a)  and  (b).  As  we  expect,  one  vehicle  will  quickly 
enter  the  Gaussian  peak  as  directed  by  prior  information  map  Mp.  By  incorporating  the  sensor  information 
from  sampling  the  real  target  map  A/t,  we  can  see  an  update  of  Mp  (a  dissipation  of  peaks).  After  a  vehicle 
destroys  the  first  teu’get  along  its  way  to  the  center  of  the  tallest  peak,  another  vehicle  rapidly  heads  to  the 
second  peak.  The  vehicle  behaves  like  in  drive  search  according  to  the  uncertainty  within  the  cone  of  the 
peak  until  it  finds  the  target. 


Pfkx  information  Map:  M  Prior  IntormaUon  Map;  M  ^(x,|r,k-40)  Prior  Information  Map:  M 


Figure  3:  Snapshots  of  the  update  of  prior  information  map  under  lC5vs3DS  PPP  coordination  with  IMF 
(Ideal  Map  Flattening  Assumption  [28]). 

Surrogate  optimization  offers  us  a  method  to  organize  the  previous  and  present  sensor  information,  which 
results  in  the  updated  prior  information  map.  The  essence  of  PPP  coordination  is  the  use  of  predicted 
future  information  to  balance  the  search  effort  or  task  allocation.  The  expanded  global  dist^ce  function  is 
just  one  specific  implementation  of  this  idea,  which  efficiently  manipulates  the  uncertainty  and  theretofore 
inhibits  clustering  introduced  by  the  distance  function.  The  same  idea  can  be  applied  to  the  surrogate 
function,  or  some  other  function  the  vehicles  take  as  a  guidance  map.  We  will  illustrate  this  concept  further 
in  Section  4.  The  idea  of  efficient  utilization  of  past,  present  and  predicted  future  information  becomes  the 
solid  foundation  of  the  various  schemes  throughout  the  paper. 


4  PPP  Behavior  Based  Coordination  with  Prior  Information 


4.1  Multiple- Attack  PPP  with  Prior  Information 


Given  that  we  have  uncertain  prior  information  about  the  possible  target  locations  Xg  ,  and  assuming  we 
know  the  number  of  targets  n  and  number  of  vehicles  m  >  n  (in  the  case  n  <  m,  we  will  following  the  same 
pattern  in  constructing  merit  functions  until  running  out  of  vehicles),  the  past  and  present  sensor  information 
will  be  used  to  correct  Mp,  and  the  predicted  future  information  will  be  used  to  to  set  up  suitable  maps, 
which  offer  suitable  destinations  for  the  following  vehicles.  We  still  implement  control  law  (8)  and  (x ) 
is  still  the  maximum  (predicted  target)  of  the  merit  function,  but  with  a  new  series  of  merit  functions: 
merit^(xi,X2, A;)  =  Mp(xi,X2, A:),  merit*(xi,X2, A:)  =  M*(xi,X2,A;)  ^ven  (2  <  t  <  AT^),  where  Nt  is  the 
number  of  not-yet-found  targets  {Nt  <  n).  M*Cxi,X2,  A:)  is  called  expanded  prior  information  function  where 
a  future  IMF  is  implemented  to  delete  peaks  to  be  taken  care  of  by  previous  planned  vehicles  (refer  to  Figure 
4).  We  also  have  merit^*'*'^(xi,X2,  A;)  =  D{xi,X2ik)  and  for  iVt+2  <i  <m  merit*(xi,X2,  A:)  =  i^*(a:i,X2,  A:), 
the  expanded  global  distance  function  as  illustrated  in  Figure  2.  Mp(xi,X2,A:)  is  updated  as  in  Equation 
(5).  We  call  this  new  implementation  multiple-attack  PPP  coordination  scheme  (abbreviated  as  MPPP). 
The  illustration  of  the  two  implementations  of  PPP  algorithm  can  be  seen  in  Figure  5,  where  we  have  four 
vehicles  doing  cooperative  search  for  three  stationary  targets  as  demonstrated  in  Figure  1  (a)  and  (b).  The 
blue  stars  stand  for  the  targets.  The  vehicles  under  new  MPPP  coordination  spend  79  steps  finding  all  the 
targets,  while  the  old  PPP  coordination  takes  139  steps  in  finding  them.  A  more  detailed  analysis  based  on 
Monte  Carlo  simulations  will  be  conducted  in  Section  4.2. 
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CxpMdwl  Prior  InforaurtionM^p  Expaixtod  Prior  (nfornutton  Map 


Figure  4:  Prior  information  map,  A;  =  5  (blue  circle  denotes  previous  locus,  red  plus  sign  denotes  planned 
trajectory,  blue  star  denotes  the  real  target  position):  (a)  Prior  information  map,  (b)  Expanded  prior 
information  map  given  one  vehicle’s  planned  trajectory,  (c)  Expanded  prior  information  map  given  two 
vehicles’  planned  trajectories,  (d)  Projection  of  (a),  (e)  Projection  of  (b),  (f)  Projection  of  (c). 


Figure  5:  Vehicle  trajectories,  blue  star  denotes  the  real  target  position  (a)  ICvsSDS,  PPP,  IMF,  (b)  MPPP, 
IMF. 


4,2  Monte  Carlo  Simulation  Results 

A  Monte  Carlo  simulation  is  performed  here  to  evaluate  PPP  coordination  as  described  Section  3  and  the 
new  MPPP  coordination  in  dealing  with  the  uncertain  prior  information.  The  Monte  Carlo  Simulation  is 
used  to  present  a  quantified  analysis.  We  have  the  following  assumptions:  four  vehicles  move  within  a  square 
area  divided  in  a  50  x  50  grid  and  that  area  contains  three  targets;  the  vehicles  are  initially  located  at 
the  four  corners  of  the  terrain  (an  easy  implementation  of  exhaustive  search,  but  not  necessary  for  PPP 
coordination);  kmax  =  625,  which  is  the  upper  bound  search  time  of  exhaustive  search  started  from  four 
corners  given  that  for  simple  exhaustive  search  we  divide  the  map  in  four  regions  of  the  same  size,  one  for 
each  vehicle;  vehicles  have  perfect  communication  and  move  at  constant  speed;  the  targets  occupy  random, 
fixed  locations  each  time  the  simulation  is  run.  Moreover,  vehicles  are  assumed  to  be  free  of  collision  and 
the  maximum  turn  angles  are  ±135  degrees.  Monte  Carlo  simulation  has  been  run  with  50  random  target 
configurations. 

Since  we  encode  the  priority  of  the  targets  as  the  height  of  the  Gaussian  peaks,  we  also  assign  different 


7 


credits  as  indicators  of  the  priority  level  of  each  target  (3  credits  for  the  target  with  the  first  priority  level, 
2  credits  for  the  second  one  and  1  credit  for  the  target  with  the  least  priority  level),  therefore  a  successful 
mission  finding  all  the  three  targets  will  obtain  300  credits  for  50  runs  of  different  target  configurations. 
We  fix  the  uncertainty  level  v,-  of  Mp  to  be  4  grids,  which  is  an  ad-hoc  selection,  and  is  chosen  with  the 
intention  of  being  sufficiently  large  with  respect  to  the  terrain  size  (50  x  60)  to  be  meaningful.  We  assign 
the  uncertainty  of  the  prior  information  by  setting  the  real  target  positions  at  lvu\vi,,,,,Vi,2vi,3vl  far 
from  the  center.  Detection  probability  pa  (defined  as  the  ratio  of  simulations  when  all  three  targets  are 
found  to  the  total  number  of  simulation),  average  of  search  step  fi  and  the  mission  credits  are  chosen  as  the 
performance  criteria.  All  the  results  refer  to  Figure  6. 


Uncertainty 

(a) 

Oetactlon  ProlMiblllty  V$  Uncertainty 


(c) 


Figure  6:  Monte  Carlo  Simulation  Results  (a)  Mission  Credits  VS  Low  Uncertainty  (b)  Mission  Credits  VS 
High  Uncertainty  (c)  Detection  Probability  VS  Uncertainty  (d)  Average  Search  Steps  VS  Low  Uncertainty, 


FVom  Figure  6,  the  two  PPP  coordination  schemes  are  robust  in  the  sense  that  all  of  them  achieve  100% 
detection  probability  given  the  uncertainty  of  the  prior  information  up  to  (Figure  6  (c))  and  much 
better  than  the  exhaustive  search  in  average  of  search  steps  (Figure  6  (d)).  Given  an  uncertainty  less 
than  Ivj,  MPPP  coordination  is  the  best  among  these  three  schemes  since  it  utilizes  most  efficiently  the 
relatively  precise  prior  information  (Figure  6  (d)).  Overall,  as  the  uncertainty  grows,  the  search  performance 
of  the  PPP  and  MPPP  coordination  schemes  degrades  accordingly.  As  expected,  the  performance  degrades 
approximately  quadratically  given  good  detection  probability  since  a  linearly  growing  uncertainty  radius 
leads  to  a  quadratically  larger  search  area  (Figure  6  (d)).  As  the  uncertainty  grows  beyond  l.Sui,  the  PPP 
algorithm  no  longer  achieves  a  good  detection  probability  since  the  prior  information  is  not  anymore  valuable 
enough  to  guide  the  search,  therefore  MPPP  behaves  worse  than  PPP  method  (Figure  6  (a)  and  (b)).  Note 
that  in  those  two  figures  that  exhaustive  search  is  not  affected  by  uncertainty  and  so  always  obtain  the  same 
number  of  credits,  in  average,  whereas  PPP  and  MPPP  degrade  with  increasing  uncertainty.  The  advantage 
of  these  methods  lies  clearly  in  cases  where  pribr  information  is  relatively  accurate. 


4.3  Scalability 


The  decentralized  architecture  of  the  coordination  schemes  we  present  mainly  determines  the  reliability  of 
the  system,  and  it  is  easy  to  expand  it  to  incorporate  more  vehicles  and  targets.  It  is  easy  to  incorporate  more 
targets  in  the  simulation  since  the  number  of  targets  does  not  affect  the  algorithm’s  complexity;  however,  an 
increase  in  the  number  of  vehicles  may  increase  computational  complexity,  especially  in  the  sequential  plan, 
i.e.,  the  construction  of  the  expanded  global  distance  functions  or  the  expanded  prior  information  maps. 
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In  particular,  the  communication  network  is  the  main  issue  for  the  scalability  of  the  system.  An  efficient 
and  reliable  communication  network  will  largely  guarantee  the  whole  performance  of  the  group  of  vehicles. 
However,  the  algorithm  is  still  functional  in  the  case  of  constrained  communication  as  long  as  we  construct 
suitable  merit  functions  (recall  Equation  (7)). 

The  current  implementations  of  PPP  algorithm  require  a  certain  amount  of  communications  (negotiations) 
between  the  vehicles.  If  we  assume  broadcasting  communication,  the  number  of  communications  at  time 
k  for  vehicles  exchanging  individual  data  to  update  the  common  prior  information  map  is  m,  where  m  is 
the  number  of  vehicles  and  the  data  transferred  include  vehicle  position  current  sensor  information 
and  target  found  or  not  information.  Then,  the  number  of  communications  required  in 
the  sequential  plan  for  the  m  vehicles  is  ^  ^  -  1,  if  we  require  link  verification.  Totally,  we  require 

-f  ^  _  1  communication  interchanges  at  each  step.  This  number,  while  not  the  same  as  computational 
complexity,  is  an  indicator  of  it,  and  implies  that  the  complexity  of  the  entire  group  grows  as  a  polynomial 
with  respect  to  the  number  of  vehicles. 


5  Concluding  Remarks 


We  present  a  possible  solution  for  the  multi-vehicle  cooperative  search  problem  via  an  adaptation  of  the 
surrogate  optimization  method.  Monte  Carlo  simulations  provide  partial  evidence  of  the  feasibility  of  our 
schemes  and  yield  several  important  guidelines  for  further  research.  The  PPP  algorithm  is  robust  in  the 
sense  that  it  can  adapt  to  different  scenarios  as  long  as  we  can  smartly  incorporate  the  predicted  future 
information  in  generating  the  control  force.  Better  choices  of  merit  functions  can  be  expected  to  boost  the 
search  performance.  Overall,  the  idea  of  efficient  utilization  of  past,  present  and  predicted  future  information 
becomes  the  solid  foundation  of  the  various  schemes  throughout  the  paper.  The  scalability  of  our  methods 
also  illustrates  the  feasibility  the  algorithm.  The  merit  function  appears  to  have  great  potential  to  achieve 
better  coordination.  An  expansion  of  cooperative  search  for  moving  target  is  not  straightforward  since 
the  assumption  of  IMF  is  no  longer  practically  implementable.  The  mathematical  model  and  heuristic 
coordination  schemes  proposed  so  far  are  a  start  towards  the  control  stability  analysis  of  the  multi-vehicle 
coordination  problem. 
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